A missing-word test comparison of human and statistical language model performance

Authors

  • Marie Owens
  • Anja Kürger
  • Paul Gerard Donnelly
  • Francis Jack Smith
  • Ji Ming
Abstract

A suite of missing-word tests, based on text extracts selected randomly from two different text corpora, provided a metric that was used to evaluate human performance, to evaluate language model performance, and to cross-compare the two. The effects of providing different sizes of context for the missing word (ranging from two words to three sentences) were examined, and two main patterns became clear from the results:

  • surprisingly, for tests where the language model was able to take advantage of all the context information provided (i.e. where the context consisted of just a few words), it outperformed humans;
  • conversely, humans outperformed the language model when the size of context given for the missing word exceeded the size that the language model could usefully employ in its probability calculations (typically more than six words).
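The idea of a missing-word test scored by a statistical language model can be illustrated with a minimal sketch. The abstract does not specify the model used, so everything here is an assumption made for illustration: a bigram model with add-one smoothing, a toy corpus, and the hypothetical helper names `train_bigrams`, `bigram_prob`, and `guess_missing_word`. Because this model looks at only one word on each side of the gap, any wider context supplied to it would go unused, which mirrors the limited context window described in the abstract.

```python
from collections import Counter


def train_bigrams(tokens):
    """Count how often each word occurs as a left context and each adjacent pair."""
    contexts = Counter(tokens[:-1])             # words that have a successor
    bigrams = Counter(zip(tokens, tokens[1:]))  # adjacent word pairs
    vocab = sorted(set(tokens))
    return contexts, bigrams, vocab


def bigram_prob(contexts, bigrams, vocab, prev, word, alpha=1.0):
    """P(word | prev) with additive (add-alpha) smoothing; an illustrative choice."""
    return (bigrams[(prev, word)] + alpha) / (contexts[prev] + alpha * len(vocab))


def guess_missing_word(contexts, bigrams, vocab, left, right):
    """Pick the vocabulary word that best fits between `left` and `right`.

    The score is P(w | left) * P(right | w), so only one word of context
    on each side of the gap influences the guess.
    """
    def score(w):
        return (bigram_prob(contexts, bigrams, vocab, left, w)
                * bigram_prob(contexts, bigrams, vocab, w, right))

    return max(vocab, key=score)


# Toy corpus, invented for this example only.
corpus = "the cat sat on the mat the cat ate the fish".split()
contexts, bigrams, vocab = train_bigrams(corpus)

# Test item: "the ___ sat"
guess = guess_missing_word(contexts, bigrams, vocab, "the", "sat")
print(guess)
```

In a full evaluation, the same test items would be given to human subjects and to the model, and the proportion of correctly restored words would serve as the shared metric.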

Similar resources

Developing the Persian version of the homophone meaning generation test

Background: Finding the right word is a necessity in communication, and its evaluation has always been a challenging clinical issue, suggesting the need for valid and reliable measurements. The Homophone Meaning Generation Test (HMGT) can measure the ability to switch between verbal concepts, which is required in word retrieval. The purpose of this study was to adapt and validate the Persian ve...

Performance evaluation of different estimation methods for missing rainfall data

There are numerous methods to estimate missing values of which some are used depending on the data type and regional climatic characteristics. In this research, part of the monthly precipitation data in Sarab synoptic station, east Azerbaijan province, Iran was randomly considered missing values. In order to study the effectiveness of various methods to estimate missing data, by seven classic s...

Schemata-Building Role of Teaching Word History in Developing Reading Comprehension Ability

Methodologically, vocabulary instruction has faced significant ups and downs during the history of language education; sometimes integrated with the other elements of language network, other times tackled as a separate component. Among many variables supposedly affecting vocabulary achievement, the role of teaching word history, as a schemata-building strategy, in developing reading comprehensi...

A method to solve the problem of missing data, outlier data and noisy data in order to improve the performance of human and information interaction

Abstract Purpose: Errors in data collection and failure to pay attention to data that are noisy in the collection process for any reason cause problems in data-based analysis and, as a result, wrong decision-making. Therefore, solving the problem of missing or noisy data before processing and analysis is of vital importance in analytical systems. The purpose of this paper is to provide a metho...

A Comparative Review of Selection Models in Longitudinal Continuous Response Data with Dropout

Missing values occur in studies of various disciplines such as social sciences, medicine, and economics. The missing mechanism in these studies should be investigated more carefully. In this article, some models, proposed in the literature on longitudinal data with dropout are reviewed and compared. In an applied example it is shown that the selection model of Hausman and Wise (1979, Econometri...

Publication date: 1999